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ABSTRACT 

Hyperspectral images can be represented either as a set of 
images or as a set of spectra. Spectral classification and seg- 
mentation and data reduction are the main problems in hyper- 
spectral image analysis. In this paper we propose a Bayesian 
estimation approach with an appropriate hiearchical model 
with hidden markovian variables which gives the possibility 
to jointly do data reduction, spectral classification and image 
segmentation. In the proposed model, the desired indepen- 
dent components are piecewise homogeneous images which 
share the same common hidden segmentation variable. Thus, 
the joint Bayesian estimation of this hidden variable as well 
as the sources and the mixing matrix of the source separation 
problem gives a solution for all the three problems of dimen- 
sionality reduction, spectra classification and segmentation of 
hyperspectral images. A few simulation results illustrate the 
performances of the proposed method compared to other clas- 
sical methods usually used in hyperspectral image processing. 

1. INTRODUCTION 

Hyperspectral images data can be represented either as a set 
of images Xoj{r) or as a set of spectra Xr{oj) where uj ^ 
indexes the wavelength and r G 7^ a pixel position d 12 O . 
In both representations, the data are dependent in both spa- 
tial positions and in spectral wavelength variable. Classical 
methods of hyperspectral image analysis try either to clas- 
sify the spectra x^{r) in K classes {ak{oo)^ k = 1, • • • , K} 
or to classify the images Xuj{r) in K classes {sk{r)^j = 
1, • • • , N}, using the classical classification methods such as 
distance based methods (like iiT-means) or probabilistic meth- 
ods using the mixture of Gaussian (MoG) modeling of the 
data. These methods thus either neglect the spatial structure 
of the spectra or the spectral natures of the pixels along the 
wavelength bands. 

The dimensionality reduction problem in hyperspectral im- 
ages can be written as: 



where the ak{uj) are the K spectral source components and 
Sk{r) are their associated images. 

This relation, when discretized, can be written as follows: 



x{r) = As{r) + e{r) 



(2) 



K 

Xr{uj) = ^ Sk{r) ak{uj) + er{uj), 



(1) 



x{r) = {xi{r)^ i = 1, • • • , M} is the set of M observed im- 
ages in different bands coi, A is the mixing matrix of dimen- 
sions (M, K) whose columns are composed of the spectra 
akioo), s{r) = {sk{r)^ k = 1, • • • , K} is the set of K un- 
known components (source images) and e{r) = {ei{r),i = 
1, • • • , m} represents the errors. 

The main objective in unsupervised classification of the 
spectra is to find both the spectra ak{uj) and their associated 
image components Sk{r). This problem, written as in equa- 
tion ^ is recognized as the Blind Source Separation (BSS) in 
signal processing community, for which, many general solu- 
tions such as Principal Components Analysis (PCA) and In- 
dependent Components Analysis (ICA) have been proposed. 
However these general purpose methods do not account for 
the specificity of the hyperspectral images. 

Indeed, as we mentioned, neither the classical methods of 
spectra or images classification nor the PCA and ICA meth- 
ods of BSS give satisfactory results for hyperspectral images. 
The reasons are that, in the first category of methods either 
they account for spatial or for spectral properties and not for 
both of them simultaneously, and PCA and ICA methods do 
not account for the specificity of the mixing matrix and the 
sources. 

In this paper, we propose to use this specificity of the hy- 
perspectral images and consider the dimensionality reduction 
problem as the blind sources separation (BSS) of equation [2] 
and use a Bayesian estimation framework with a hierarchical 
model for the sources with a common hidden classification 
variable which is modelled as a Potts-Markov field. The joint 
estimation of this hidden variable, the sources and the mixing 
matrix of the BSS problem gives a solution for all the three 
problems of dimensionality reduction, spectra classification 
and segmentation of hyperspectral images. 



2. PROPOSED MODEL AND METHOD 

We propose to consider the equation © written in the follow- 
ing vector form: 

x = As^e (3) 

where we used x = {x(r),r G 1Z}, s = {s(r),r G 1Z} 
and € = {e{r)^r G 1Z} and we are going to account for the 
specificity of the hyperspectral images through a probabilis- 
tic modeling of all the unknowns, starting by assuming that 
the errors e{r) are centered, white, Gaussian with co variance 
matrix Xle = diag [a^^ , .., a^^] . This leads to 



(4) 



The next step is to model the sources. As we mentioned in 
the introduction, we want to impose to all these sources s{r) 
to be piecewise homogeneous and share the same common 
segmentation, where the pixels in each region are considered 
to be homogeneous and associated to a particular spectrum 
representing the type of the material in that region. We also 
want that those spectra be classified in K distinct classes, 
thus all the pixels in regions associated with a particular spec- 
trum share some common statistical parameters. This can be 
achieved through the introduction of a discrete valued hidden 
variable z{r) representing the labels associated to each type 
of material and thus assuming the following: 

p{sj{r)\z{r) = k))=U{mj„a]j, k=l,---,K (5) 

with the following Potts-Markov field model 



p{z) oc exp 



/3E E S{z{r) - z{r')) 

r r'eV{r) 



(6) 



where z = {z{r),r G 1Z} represents the common segmenta- 
tion of the sources and the data. The parameter /3 controls the 
mean size of those regions. 

We may note that, assuming a priori that the sources are 
mutually independent and that pixels in each class k are inde- 
pendent form those of class k\ we have 

p{s\z) = E E Ep(^i('')i^w = ^)) 

k relZk j 

where 1Zk = {r : z{r) = k} and 1Z = UkTlk- 

To insure that each image Sj (r) is only non-zero in those 
regions associated with the kth spectrum, we impose K = n 



and m 
write 



'Jk 



0, Vj 7^ k and crj^ = 0, Vj 7^ k. We may then 



pU\z) = ^p{s{r)\z{r) = k)) = ^ AA(mfc(r), Sfc(r)) 

r r 

(8) 

where mk{r) is a vector of size n with all elements equal 
to zero except the /c-th element k = z{r) and S)fe(^) is a 



diagonal matrix of size n x n with all elements equal to zero 
except the k-\h main diagonal element where k = z{r). 

Combining the observed data model ([3]) and the sources 
model © of the previous section, we obtain the following 
hierarchical model: 

[ [ [ [ [ I n 1 1 
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Fig. 1. Proposed hierarchical model for hyperspectral images: 
the sources Sj{r) are hidden variables for the data Xi{r) and 
the common classification and segmentation variables z{r) is 
a hidden variable for the sources. 

3. BAYESIAN ESTIMATION FRAMEWORK 

Using the prior data model (O, the prior source model Q and 
the prior Potts-Markov model ([5]) and also assigning appropri- 
ate prior probability laws p{A) and p{0) to the hyperparam- 
eters 6_ = {^e, Og} where 6^ = and Og = {(m^^, a^^)}, 
we obtain an expression for the posterior law 

p{s, z, A, ^1^) oc p{x\s, A, Oe) p{s\z, Os)p{z) p{A) p{e) 

(9) 

I this paper, we used conjugate priors for all of them, i.e., 
Gaussian for the elements of A, Gaussian for the means m^^ 
and inverse Gamma for the variances cr|^ as well as for the 
noise variances a^i- 

When given the expression of the posterior law, we can 
then use it to define an estimator such as Joint Maximum A 
Posteriori (JMAP) or the Posterior Means (PM) for all the 
unknowns. The first needs optimization algorithms and the 
second integration methods. Both are computationally de- 
manding. Alternate optimization is generally used for the first 
while the MCMC techniques are used for the second. 

In this work, we propose to separate the unknowns in two 
sets (s, z) and (A, 6) and then use the following iterative al- 
gorithm: 

• Estimate (s, z) using p{s, z\A,0, x) by 

p{s\z^ A^O^x) and z^p{z\A^6^x) 

• Estimate (A, 6) using p( A, 6\s_^ £, x) by 

A ^ p{A\s_,z,0,x) and ^ p{0\s_,z, A,x) 

In this algorithm, ~ represents either argmax or generate 
sample using or still compute the Mean Field Approximation 
(MFA). To implement this algorithm, we need the following 
expressions: 

• p{s\z^ A, 6^ x) (X p{x\s^ A, Ee) p{s\z^ 6). 

It is then easy to see that p{s\z^ A, 6^ x) is separable in r: 

P{s\z,0,x) = Y[p{s{r)\z{r),0,x{r)) 

r 

= Y[Af{s{r),B{r)) (10) 



with 



s{r) =B{r)[A^T.-^x{r)^ 



(11) 



In this relation m 



z{r) 



is a vector of size n with all ele- 



ments equal to zero except the /c-th element where k = z{r) 
and Tiz^r) is a diagonal matrix of size n x n with all elements 
equal to zero except the k-\h diagonal where k = z{r). 

• p{z\A^6^x) (X p{x\z, A^O) p{z), where 
p{x\z,A,e) = l[p{x{r)\z{r),A,0) (12) 

r 
r 

It is then easy to see that, even if p{x\z^ A, 6) is separable in 
r, p{z\ A, 6 ^x) is not and it has the same markovian structure 
thatp(z). 

• p{A\z,0,x) oc p{x\z, A,0) p{A). 

It is easy to see that, with a Gaussian or uniform prior for 
p{A) we obtain a Gaussian expression for this posterior law. 
Indeed, with an uniform prior, the posterior mean is equiv- 
alent to the posterior mode and equivalent to the Maximum 
Likelihood (ML) estimate A = Siig msiX a {p{x\z^ A^ 0)} 
whose expression is: 



A = 



'^x{r)s\r) 



'^s{r)s\r) ^ B{r) 



where s{r) and B{r) are given by (fTTI) . 

• p{Re\z,A,e,x)(xp{x\z,A,e)p{Re). 

It is also easy to show that, with an uniform prior on the log- 
arithmic scale or an inverse gamma prior for the noise vari- 
ances, the posterior is also an inverse gamma. 

• p{6\z,A,x) (xp{x\z,A,6)p{6) 

Again here, using the conjugate priors for the means m^^ and 
inverse gamma for the variances cr|^ we can obtain easily the 
expressions of the posterior laws for them. 

Details of the expressions of p{A\z^ 6^ x),p{Re\z^ A^O^x) 
and p{6\z^ A^x) as well as their modes and means can be 
found in |l4jl. 

4. COMPUTATIONAL CONSIDERATIONS AND 
MEAN FIELD APPROXIMATION 

As we can see, the expression of the conditional posterior of 
the sources is separable in r but this is not the case for the 
conditional posterior of the hidden variable z{r). So, even if 
it is possible to generate samples from this posterior using a 
Gibbs sampling scheme, the cost of the computation is very 
high for real applications. The Mean Field Approximation 
(MFA) then becomes a natural tool for obtaining approximate 
solutions with lower computational cost. 



The mean field approximation is a general method for ap- 
proximating the expectation of a Markov random variable. 
The idea consists in, when considering a pixel, to neglect the 
fluctuation of its neighbor pixels by fixing them to their mean 
values. Another interpretation of the MFA is to ap- 

proximate a non separable 



p{z) oc exp 



r r' 



oc '[\p{z{r)\z{r'),r' &V{r)) 

r 

with the following separable one: 

q{z)^\[q{z{r)\z{r'),r' eV{r)) 

r 

where z{r') is the expected value of z{r') computed using 
q{zh). This approximate separable expression is obtained in 
such a way to minimize KL{p^ q) for a given class of separa- 
ble distributions q e Q. 

Using now this approximation in the expression of the 
conditional posterior law p{z\A^O^x) gives the separable MFA 

q{z\A, e,x)=l[ q{z{r)\z{r^), r' e V(r), A, 0, x{r)) 

r 

where q{z{r)\z{r')^r' G V(r), A, ^, ic(r)) = 

p{x{r)\z{r),A,e) q{z{r)\z{r'),r' G V(r)) 
and z{r) can be computed by 



z{r) 



E.(.) <r) q{zir)\z{r'),r' G V{r), A,0,x{r)) 
Ezir)Qi<r)\z{r'),r' eV{r),A,0,x{r)) 

5. SIMULATION RESULTS 



The main objectives of these simulations are: first to show 
that the proposed algorithm gives the desired results, and sec- 
ond to compare its relative performances with respect to some 
classical methods. For this purpose, first we generated some 
simulated data according to the data generatin model, i.e.; 
starting by generating z{r), then the sources s(r), then using 
some given spectral signatures obtained from real materials 
construct the mixing matrix A and finally generate data x{r). 
Fig. 2 shows an example of such data generated with the fol- 
lowing parameters: m = 32^ n = 4^ K = 4 and SNR=20 dB 
and Fig. 3 shows a comparison of the results obtained by two 
classical spectral and image classification methods using the 
classical iiT-means with the results obtained by the proposed 
method. Some other simulated results as well as the results 
obtained on real data will be given in near future. 

6. CONCLUSION 

Classical methods of data reduction in hyperspectral imaging 
use classification methods either to classify the spectra or to 





Fig. 2. Two examples of data generating process: a) z{r) b) 
spectral signatures used to construct the mixing matrix A and 
c) m = 32 images. Upper row: K = 4 and image sizes 
(64x64). Lower row: K = 8 and image sizes (128x128). 



classify the images in K classes where K is, in general, much 
less than the number of spectra or the number of observed im- 
ages. However, these methods neglect either the spatial orga- 
nization of the spectra or the spectral property of the pixels 
along the spectral bands. In this paper, we considered the 
dimensionality reduction problem in hyperspectral images as 
a source separation and presented a Bayesian estimation ap- 
proach with an appropriate hierarchical prior model for the 
observations and sources which accounts for both spectral and 
spatial structure of the data, and thus, gives the possibility to 
jointly do dimensionality reduction, classification of spectra 
and segmentation of the images. 
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Fig. 3. Dimensionality reduction by different methods: a) 
Spectral classification using iiT-means, b) Image classifica- 
tion using i^-means, c) Proposed method. Upper row shows 
estimated z{r) and lower row the estimated spectra. These 
results have to be compared to the original z{r) and spectra 
in previous figure. 
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Fig. 4. Real data: a) Spectral classification using i^T-means, 
b) Image classification using i^T-means, c) Proposed method. 
Upper row shows estimated z{r) and lower row the estimated 
spectra. 



